Abstract

From a geometric perspective most nonlinear binary classification algorithms, including state-of-the-art
versions of Support Vector Machine (SVM) and Radial Basis Function Network
(RBFN) classifiers, are based on the idea of reconstructing indicator functions. We propose
instead to use reconstruction of the signed distance function (SDF) as a basis for binary
classification. We discuss properties of the signed distance function that can be exploited in 
classification algorithms. We develop simple versions of such classifiers and test them on several 
linear and nonlinear problems. On linear tests accuracy of the new algorithm exceeds that of 
standard SVM methods, with an average of 50% fewer misclassifications. Performance of the 
new methods also matches or exceeds that of standard methods on several nonlinear problems 
including classification of benchmark diagnostic micro-array data sets. 



Machine Learning, Microarray Data 



1 Introduction 

Binary classification is a basic problem in machine learning with applications in many fields. Not 
only does binary classification have many potential direct applications, it is also the basis for 
many multi-category classification methods. Of particular interest are the applications in biology 
and medicine. The availability of micro-array and proteomic data sets that contain thousands
or even tens of thousands of measurements has made it particularly important to develop good
classification algorithms, since reliable use of these data could presumably revolutionize diagnostic
medicine. Several binary classification algorithms have been developed and studied intensely over
the past few years; most notable among these are the support vector machine (SVM) methods
using radial basis functions and other functions as kernels. SVM methods have been shown to
perform reasonably well in classifying micro-array data, demonstrating that the extraction of useful 
information from these large data sets is feasible. 

We will begin in the next section with a geometric, rather than statistical, statement of the
binary classification problem, and our discussion will be restricted to the context of this geometric
viewpoint. Nonlinear SVM methods, despite their geometric, maximal margin origin, have been
developed based on the idea of reconstructing "indicator functions", as discussed by Poggio and
Smale [14]. RBFN methods are also currently employed in this way. The indicator function is an
object that encodes only the most primitive geometric information. We propose that a potentially 
better tool for classification is the "signed distance function" (SDF), an intrinsically geometric 
object that has been employed in several areas of applied mathematics. The geometric properties 
of the SDF make it advantageous for use in classification and we give examples of how these 
properties can be exploited. 

In this paper we also present preliminary test results for rudimentary implementations based on 
the idea of reconstructing the SDF from training data. We demonstrate that these non-optimized 
SDF-based algorithms outperform standard SVM (LIBSVM) methods on average by 50% (half as 
many misclassifications) on linearly separable problems. We also present a comparison of SDF
classification with SVM methods on the challenging geometric 4 by 4 checkerboard problem, in which
the SDF-based method performs better. Finally, on two benchmark cancer-diagnosis micro-array
data sets a nonlinear SDF algorithm performs as well as or better than highly developed SVM
methods. While these results are obviously not conclusive, they do demonstrate that the SDF
paradigm is promising and worth further investigation. 

2 A Geometric Formulation and a Geometric Tool 

Suppose a set $X \subset \mathbb{R}^n$ is partitioned by $A \subset X$ and its complement $A^c$. The set $A$ might contain
the set of test values that are associated with the presence of a disease, while $A^c$ contains those
values that are not. For applications we may suppose that $A$ is a reasonably nice set, e.g. has
a smooth boundary. The problem of binary classification is then to determine the set $A$ given a
finite sample of data, i.e. for a set of points $\{x_i\}_{i=1}^m$ we know whether $x_i \in A$ or $x_i \in A^c$, for each $i$.
The purpose of solving this problem is obviously predictive power; given a point $x \in X$ that is not
among the given data $\{x_i\}_{i=1}^m$, we wish to determine if $x \in A$. In this paper we only consider this
geometric formulation of the problem. While this formulation is obviously somewhat restrictive, 
it allows for geometric analysis and leads to a new class of methods that may be useful in some 
applications. 

One way to approach this problem mathematically that is common among nonlinear classifiers
is to consider the indicator function of $A$, $\iota_A : X \to \{-1, 1\}$, defined by

$$ \iota_A(x) = \begin{cases} 1 & \text{if } x \in A, \\ -1 & \text{if } x \in A^c. \end{cases} $$

The known information is represented by $\{(x_i, \iota_A(x_i))\}_{i=1}^m$ and the problem of binary classification is
equivalent to reconstructing $\iota_A(x)$ from the data.

Current methods of binary classification such as the SVM methods and RBFN methods work by
attempting to approximate the indicator function $\iota_A$ by regression over the known data, $\{(x_i, \iota_A(x_i))\}$.
This was not the conceptual origin of the SVM, i.e. finding a separating plane of maximal margin,
but is the basis for nonlinear "kernel trick" algorithms and efficient linear implementation as
pointed out in [14]. In practice, the SVM constructed functions are smooth and $\pm 1$ are not the
only values in the range, thus $x$ is interpreted as in $A$ if the constructed function is positive at $x$
and as being in $A^c$ if it is negative.

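Rather than using the indicator function $\iota_A$, we propose using the signed distance function
(SDF) of $A$, denoted $b_A(x)$, which is the distance to the boundary, $\partial A$, of $A$ if $x \in A$ or minus the
distance to $\partial A$ if $x \in A^c$, i.e.

$$ b_A(x) = \begin{cases} d(x, \partial A) & \text{if } x \in A, \\ -d(x, \partial A) & \text{if } x \in A^c, \end{cases} $$

where $d$ is a metric (distance function) and $d(x, \partial A) = \inf_{y \in \partial A} d(x, y)$.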
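For a simple set the signed distance function is available in closed form, which makes the definition easy to check by hand. The sketch below is only an illustration: the disc, its radius and the function names are assumptions chosen here and do not come from the tests in this paper. It evaluates the SDF of a disc in the plane with the Euclidean metric and recovers the indicator as its sign.

```python
import math

def sdf_disc(x, center=(0.0, 0.0), radius=1.0):
    """Signed distance to the boundary of a disc A (illustrative choice of A):
    positive inside A, negative in the complement."""
    return radius - math.dist(x, center)

def indicator_from_sdf(value):
    """Recover the +1/-1 indicator from the sign of the signed distance."""
    return 1 if value > 0 else -1

inside = sdf_disc((0.5, 0.0))    # point inside the unit disc
outside = sdf_disc((3.0, 4.0))   # point at distance 5 from the center
```

Here `inside` carries both the class (its sign) and how far the point lies from the boundary, which is the extra information the indicator discards.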

Knowledge of $b_A$ is obviously sufficient to fully determine the set $A$; it carries more information
than $\iota_A$, and it has some smoothness. These and other properties can be used in classification
and thus the SDF has advantages over the indicator function as a basis for binary classification
algorithms. In fact we argue that SDF-based classification, because it is more geometrical, is
conceptually a more faithful generalization of the original SVM concept than existing nonlinear
(kernel trick) SVM implementations.

Binary classification based on the SDF can work in much the same way as indicator function
based classification; one attempts to approximate the function $b_A$ using only the given data. If the
value of $b_A$ is positive at a test point $x$, then $x$ is predicted to be in $A$ and if the value is negative,
$x$ is predicted to be in $A^c$. The approximation of $b_A$ can proceed similarly to that of $\iota_A$, i.e. by
various forms of regression (including SVM and RBFN regression). A practical difference is that
$\iota_A$ is given explicitly at the data points, whereas $b_A$ at the data points must be derived from the
data. We investigate simple methods for doing this and show that they give reliable results. We
also show that properties of $b_A$ can be used to refine those estimates. The complexity of this task
is no worse than that needed to perform the regression itself, hence no computational performance
is sacrificed.

3 Signed Distance Functions 

In some places $b_A$ is called the oriented boundary distance function or the oriented distance function.
Proofs of the following facts about $b_A$ can be found in [?].

Fact 1 The function $b_A$ is Lipschitz continuous, with Lipschitz constant 1.

In other words, $|b_A(x) - b_A(y)| \le |x - y|$ holds for all $x, y \in X$. This implies that $b_A(x)$ is
differentiable almost everywhere, i.e. $Db_A(x)$ exists except on a set of zero measure. It also implies
that $b_A(x)$ belongs to the Sobolev space $W^{1,p}_{\mathrm{loc}}$ for any $1 \le p \le \infty$.

Fact 2 If $b_A$ is differentiable at a point $x$, then there exists a unique $P_x \in \partial A$ such that $|b_A(x)| =
|x - P_x|$ and

$$ Db_A(x) = \frac{P_x - x}{|P_x - x|}. \tag{2} $$

In the case it is unique, $P_x \in \partial A$ is called the projection of $x$ onto $\partial A$. In particular, $|\nabla b_A(x)| = 1$
and $Db_A(x)$ points from $x$ toward $P_x$.

Fact 3 Let $A$ be a subset of $\mathbb{R}^n$ with nonempty boundary $\partial A$. Then $b_A$ is a convex function if and
only if $A$ is a convex set. If $A$ is convex, then $b_A$ is differentiable everywhere in $A^c$.

Fact 4 If $\partial A$ is of smoothness class $C^k$, i.e. it is $k$ times continuously differentiable, $k > 1$, then
for each $y \in \partial A$ there is a neighborhood $V(y)$ of $y$ on which $b_A$ is a $C^k$ function.

If $\partial A$ is $C^1$ then at any point $y \in \partial A$ we can define the unit normal vector $n(y)$.

Fact 5 Suppose $\partial A$ is of smoothness class $C^1$ and $n(y)$ is Lipschitz continuous. At any point
$y \in \partial A$, let $V(y)$ be as in Fact 4. Then for any $x \in V(y)$, $x = P_x + b_A(x) n(P_x)$.

In particular, $P_x - x$ and $Db_A(x)$ are normal to $\partial A$ at $P_x$.

Fact 6 Let $H_y$ denote the mean curvature of $\partial A$ at a point $y \in \partial A$. Then wherever $H_y$ exists it
satisfies

$$ H_y = \Delta b_A(y), $$

where $\Delta$ is the usual Laplace operator.
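Several of these facts can be checked numerically on a set whose SDF is known exactly. The following sketch does this for the unit disc, where $b_A(x) = 1 - |x|$; the choice of set, the sample size, the finite-difference step and all names are assumptions made here for illustration. It tests the Lipschitz bound of Fact 1, the unit gradient norm of Fact 2, and the identity $x = P_x + b_A(x) n(P_x)$ of Fact 5, taking $n$ to be the unit normal pointing into $A$.

```python
import math
import random

def b(x, r=1.0):
    """Exact SDF of the disc of radius r centered at the origin (positive inside)."""
    return r - math.hypot(x[0], x[1])

def grad_fd(x, h=1e-6):
    """Central finite-difference approximation of the gradient of b at x."""
    gx = (b((x[0] + h, x[1])) - b((x[0] - h, x[1]))) / (2 * h)
    gy = (b((x[0], x[1] + h)) - b((x[0], x[1] - h))) / (2 * h)
    return gx, gy

def reconstruct(x, r=1.0):
    """Rebuild x as P_x + b(x) * n(P_x), with n the inward unit normal at P_x."""
    s = math.hypot(x[0], x[1])
    px = (r * x[0] / s, r * x[1] / s)        # projection of x onto the circle
    n = (-px[0] / r, -px[1] / r)             # unit normal pointing into the disc
    return (px[0] + b(x, r) * n[0], px[1] + b(x, r) * n[1])

random.seed(0)
pts = [(random.uniform(-2, 2), random.uniform(-2, 2)) for _ in range(200)]

# Fact 1: |b(p) - b(q)| <= |p - q| for every pair of sample points.
lipschitz_ok = all(abs(b(p) - b(q)) <= math.dist(p, q) + 1e-12
                   for p in pts for q in pts)

# Fact 2: the gradient has unit length wherever b is differentiable
# (the center of the disc, where the projection is not unique, is avoided).
grad_norms = [math.hypot(*grad_fd(p)) for p in pts if math.hypot(*p) > 0.1]
```

For the disc the neighborhood $V(y)$ of Fact 5 can be taken to be the whole plane minus the center, which is why `reconstruct` works for any nonzero point.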

The function $b_A$ has been used in various branches of applied mathematics such as free boundary problems
[?, ?, 16, 18] and grid generation for finite-element methods. It is intimately related to flow
by mean curvature [?] and occurs in the solution of certain Hamilton-Jacobi partial differential
equations [?, p. 163]. The geometric nature of the SDF connects it to well developed areas of
geometry and analysis that can be expected to provide tools for both refinement and analysis of
SDF-based classification methods.

4 SDF Classifiers 

4.1 Preliminary algorithms 

In the SDF paradigm the input training data are marked as to class, but they do not come marked
with the values $b_A(x_i)$, and hence these need to be approximated. A naive SDF algorithm then
consists of two simple steps, with an optional third refinement step.

• Approximate $b_A$ at the training data $\{x_i\}_{i=1}^m$. Denote these approximations by $\{b_i\}_{i=1}^m$.

• Approximate $b_A$ by a function $B_D(x)$ on the entire domain through regression on $D :=
\{(x_i, b_i)\}_{i=1}^m$.

• Use the constructed function $B_D$ and properties of $b_A$ to improve the estimates $\{b_i\}_{i=1}^m$ and
iterate.

We now detail preliminary algorithms for these three steps and point to those areas that we consider 
important for further investigation. 

4.2 Estimating $b_A$ at the data

Let $d$ denote a metric on $X \subset \mathbb{R}^n$. Usually we will let $d$ be the Euclidean metric, but we will also
consider weighted distances. A reasonable and simple first approximation of $b_A$ at $\{x_i\}_{i=1}^m$ is given
by

$$ b_i = \iota_A(x_i) \cdot \min\{ d(x_i, x_j) : \iota_A(x_j) \neq \iota_A(x_i) \}, $$

i.e. the signed distance from $x_i$ to its projection $P'x_i$ onto the (finite) data of opposite type. It is clear
that $b_i$ has the correct sign and is a bound on $b_A(x_i)$, i.e.

$$ |b_A(x_i)| \le |b_i|. \tag{3} $$
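The first approximation amounts to a nearest-neighbor search restricted to the opposite class. A minimal sketch, assuming the Euclidean metric, labels coded as +1 for $A$ and -1 for its complement, and a training set that contains both classes (the function and variable names are chosen here for illustration):

```python
import math

def initial_sdf_estimates(points, labels):
    """b_i = label_i * (distance from x_i to the nearest training point of the
    opposite class). Requires at least one point of each class."""
    estimates = []
    for x, lab in zip(points, labels):
        nearest = min(math.dist(x, y)
                      for y, other in zip(points, labels) if other != lab)
        estimates.append(lab * nearest)
    return estimates

# Toy data separated by the horizontal axis; the true signed distances are
# 1, 3, -1 and -2, so each estimate should bound them in absolute value.
points = [(0.0, 1.0), (0.0, 3.0), (0.0, -1.0), (2.0, -2.0)]
labels = [1, 1, -1, -1]
b = initial_sdf_estimates(points, labels)
```

On this toy set the estimates are 2, 4, -2 and $-\sqrt{13}$, each with the correct sign and each at least as large in magnitude as the true value, as the bound (3) requires.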

It is easy to show by counter-example that obtaining more precise, yet rigorous, bounds on
$b_A(x_i)$ would require some assumptions on the shape of $A$. However, we can make some heuristic
improvements in $\{b_i\}$. For instance, consider

$$ \tilde{b}_i = b_i + b_A(P'x_i). $$

Since $b_A(P'x_i)$ has the sign opposite to that of $b_i$, we have

$$ |b_A(x_i)| \le |\tilde{b}_i| \le |b_i|. $$

Now suppose that $x_i \in A$ and $y_i \in A^c$, where $y_i$ is the closest data point to $x_i$ in $A^c$ and $x_i$ is the
closest data point in $A$ to $y_i$. If $\partial A$ is situated half way between $x_i$ and $y_i$ and is normal to $y_i - x_i$,
then $b_A(x_i) = d(x_i, y_i)/2$. Let $c_i$ denote the approximated signed distance of $y_i$ to the boundary, so
that $c_i = -b_i$; then we have:

$$ b_A(x_i) = \frac{1}{2} b_i = b_i + \frac{1}{2} c_i. $$

Then for any $x_i$ and $y_i = P'x_i$ we define

$$ b'_i = b_i + 0.5\, c_i, \tag{4} $$

where $b_i$ is the first approximation of $b_A(x_i)$ and $c_i$ is the first approximation of $b_A(y_i)$. It can be
demonstrated that, on average, $b'_i$ is a better approximation of $b_A(x_i)$ than $b_i$.
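The correction (4) can be sketched as a second pass over the first estimates. The code below reads (4) as shrinking the magnitude $|b_i|$ by half of the magnitude $|c_i|$ while keeping the sign of $b_i$; this reading, the Euclidean metric, the +1/-1 label coding and the names are assumptions made for the illustration.

```python
import math

def refined_sdf_estimates(points, labels):
    """Half-way correction: |b'_i| = |b_i| - 0.5 * |c_i|, where c_i is the first
    estimate at y_i, the nearest training point of the opposite class."""
    m = len(points)
    nearest_idx, first = [], []
    for i in range(m):
        j = min((k for k in range(m) if labels[k] != labels[i]),
                key=lambda k: math.dist(points[i], points[k]))
        nearest_idx.append(j)
        first.append(labels[i] * math.dist(points[i], points[j]))
    return [labels[i] * (abs(first[i]) - 0.5 * abs(first[nearest_idx[i]]))
            for i in range(m)]

# Points on a vertical line with the true boundary on the horizontal axis.
points = [(0.0, 1.0), (0.0, 3.0), (0.0, -1.0)]
labels = [1, 1, -1]
refined = refined_sdf_estimates(points, labels)
```

Because $|c_i| \le |b_i|$ (the point $x_i$ is itself a candidate when searching from $y_i$), the corrected value never changes sign. In this symmetric toy configuration the correction recovers the true signed distances 1, 3 and -1 exactly.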

4.3 Linear classifier

We will say that a binary classification problem in the context of our geometric formulation is
linearly separable if $\partial A$ is a hyperplane in $\mathbb{R}^n$. In this case $b_A(x)$ will be a linear function whose
zero set is $\partial A$. To use the SDF for a linearly separable problem one would seek a linear function
as $B_D(x)$, i.e.

$$ y = \ell(x) = w \cdot x + c = w_1 x_1 + w_2 x_2 + \dots + w_n x_n + c $$

that fits the data $\{(x_i, b_i)\}_{i=1}^m$. The most obvious choice for this approximation is to use the linear
least squares approximation if $m > n$ or projection (pseudoinverse) if $m < n$, for which there are
highly developed algorithms. In the linear tests below, we have used linear least squares regression.
From Fact 2 in §3, we have $|\nabla b_A(x)| = 1$ wherever it exists so we could let

$$ |D\ell(x)| = |w| = 1 $$

be a constraint in the linear least squares approximation.
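A small sketch of the unconstrained least squares fit follows. For brevity it solves the normal equations with a hand-written elimination routine, which assumes $m > n$ and a design matrix of full column rank; a library least squares routine based on QR or SVD would be the more robust choice in practice. The helper names and the toy data are assumptions made here, and the constraint $|w| = 1$ is not imposed.

```python
def solve(A, y):
    """Solve the square system A z = y by Gaussian elimination with partial pivoting."""
    n = len(y)
    M = [list(row) + [rhs] for row, rhs in zip(A, y)]    # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= factor * M[col][k]
    z = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][k] * z[k] for k in range(i + 1, n))
        z[i] = (M[i][n] - tail) / M[i][i]
    return z

def fit_linear(points, targets):
    """Least-squares fit of l(x) = w . x + c through the normal equations."""
    rows = [list(p) + [1.0] for p in points]             # last column multiplies c
    d = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(d)] for i in range(d)]
    atb = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(d)]
    z = solve(ata, atb)
    return z[:-1], z[-1]

def classify(w, c, x):
    """Predict +1 (in A) or -1 (in the complement) from the sign of l(x)."""
    return 1 if sum(wk * xk for wk, xk in zip(w, x)) + c > 0 else -1

# Toy data whose targets are the exact signed distances to the line x_2 = 0.
points = [(-1.0, -1.0), (1.0, -0.5), (0.0, 0.5), (2.0, 1.0), (-2.0, 2.0)]
targets = [p[1] for p in points]
w, c = fit_linear(points, targets)
```

With exact signed distances as targets the fit returns $w = (0, 1)$ and $c = 0$, so $|w| = 1$ holds automatically; with approximate targets $b_i$ the norm of $w$ will generally differ from 1, which is where the constraint would matter.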

4.4 Nonlinear classifier

If the problem is not linearly separable, then $b_A(x)$ will be a nonlinear function. To approximate
it we should use some form of nonlinear regression on $\{(x_i, b_i)\}_{i=1}^m$. There are many options for
nonlinear regression that could be used, including SVM and RBFN regression algorithms. One
simple, yet appealing, choice for the nonlinear regression is the least squares regression discussed
in [?]. We implemented this approach in the nonlinear tests reported below and so we recall it.
Let $K : X \times X \to \mathbb{R}$ be a kernel that is symmetric ($K(x, y) = K(y, x)$) and positive-definite,
i.e.,

$$ \sum_{i,j=1}^{k} c_i c_j K(x_i, x_j) \ge 0, $$

for any $k$, any $x_1, \dots, x_k$ and any $c_1, \dots, c_k$. In the tests below we use the Gaussian kernel with
width parameter $\sigma$, which is symmetric and positive definite. Let $\mathbf{K}$ be the square, positive-definite
matrix with elements $\mathbf{K}_{ij} = K(x_i, x_j)$. Let $I$ be the identity matrix and $b$ be the vector with
coordinates $b_i$. Let $\gamma > 0$ and let $c$ be the solution of the linear system of equations:

$$ (\mathbf{K} + m\gamma I)c = b. \tag{5} $$

This problem is well-posed since $(\mathbf{K} + m\gamma I)$ is strictly positive-definite and the condition number
will be good provided that $m\gamma$ is not too small. The number $\gamma$ can be viewed as a smoothing
parameter. Given the solution vector $c$, the approximation, $B_D(x)$, of $b_A(x)$ is given by

$$ B_D(x) = \sum_{i=1}^{m} c_i K(x_i, x). $$

The choices of $\sigma$ and $\gamma$ in this approximation are discussed in [?] and elsewhere. In tests
discussed below, results were insensitive to $\gamma$ within a fairly large range. Good choices for $\sigma$, which
we found by cross-validation, were on the order of the mean inter-data distance.
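The regression step can be sketched in a few lines. The code below assumes the Gaussian kernel in the normalization $\exp(-|x-y|^2/(2\sigma^2))$, which is one common convention and not necessarily the one used in the tests, and it solves the system (5) with a Cholesky factorization, a natural choice because the matrix is symmetric and strictly positive-definite. All function names and the toy data are illustrative assumptions.

```python
import math

def gaussian_kernel(x, y, sigma):
    """Gaussian kernel; the factor 2 in the denominator is an assumed convention."""
    return math.exp(-math.dist(x, y) ** 2 / (2.0 * sigma ** 2))

def cholesky_solve(A, y):
    """Solve A z = y for a symmetric positive-definite A via A = L L^T."""
    n = len(y)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    u = [0.0] * n                                   # forward substitution: L u = y
    for i in range(n):
        u[i] = (y[i] - sum(L[i][k] * u[k] for k in range(i))) / L[i][i]
    z = [0.0] * n                                   # back substitution: L^T z = u
    for i in reversed(range(n)):
        z[i] = (u[i] - sum(L[k][i] * z[k] for k in range(i + 1, n))) / L[i][i]
    return z

def fit_kernel_ls(points, b, sigma, gamma):
    """Solve (K + m * gamma * I) c = b for the coefficient vector c."""
    m = len(points)
    system = [[gaussian_kernel(points[i], points[j], sigma)
               + (m * gamma if i == j else 0.0) for j in range(m)]
              for i in range(m)]
    return cholesky_solve(system, b)

def predict(points, coeffs, sigma, x):
    """Evaluate B_D(x) = sum_i c_i K(x_i, x)."""
    return sum(ci * gaussian_kernel(xi, x, sigma) for xi, ci in zip(points, coeffs))

# XOR-like toy problem; b holds the nearest-opposite-point estimates (+2 or -2).
points = [(1.0, 1.0), (-1.0, -1.0), (1.0, -1.0), (-1.0, 1.0)]
b = [2.0, 2.0, -2.0, -2.0]
coeffs = fit_kernel_ls(points, b, sigma=1.0, gamma=1e-3)
```

A test point is then assigned to $A$ when `predict` returns a positive value. The dense solve costs on the order of $m^3$ operations, so for large training sets an iterative solver would be the natural replacement.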

We emphasize that other regression methods could be used in connection with SDF-based
classification and should be investigated.

4.5 Iteration 

There are some possibilities for refining the initial data approximation $\{b_i\}$. One idea for nonlinear
problems would be to let

$$ b'_i = B_D(x_i), $$

where $B_D$ is the regression obtained from $\{(x_i, b_i)\}$. Then one could take $b'_i$ as a refinement of $b_i$
and then run the regression again. (For linear regression, this would repeat the original results.)
We note that iterations of this type are related to the matrix eigenvalue problem for which there
are well developed numerical techniques and which are amenable to stability analysis.

Another possibility is to use the facts about $Db_A$ in §3 above to refine $b_i$. From these facts it is
reasonable to project $y_i - x_i$ onto $D\ell(x_i)$, where $y_i = P'x_i$. This approach is particularly appealing
for problems that are known a priori to be linearly separable. We have successfully implemented this
iterative approach in the linear tests below, using the projection onto $w = D\ell$ to refine $\{b_i\}_{i=1}^m$.
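One reading of the projection step is sketched below: given the current normal direction $w$ from the linear regression, each estimate is replaced by the signed length of the component of $y_i - x_i$ along $w$, where $y_i$ is the nearest training point of the opposite class. This is an interpretation offered for illustration, not a transcription of the implementation used in the tests; in a full loop the step would alternate with a new linear fit on the refined values.

```python
import math

def project_estimates(points, labels, w):
    """Refine each b_i as label_i * |(y_i - x_i) . w| / |w|, with y_i the nearest
    training point of the opposite class (Euclidean metric assumed)."""
    norm_w = math.sqrt(sum(wk * wk for wk in w))
    refined = []
    for x, lab in zip(points, labels):
        y = min((p for p, other in zip(points, labels) if other != lab),
                key=lambda p: math.dist(x, p))
        along = sum(wk * (yk - xk) for wk, yk, xk in zip(w, y, x)) / norm_w
        refined.append(lab * abs(along))
    return refined

# Two points on opposite sides of the line x_2 = 0, with w the true unit normal.
points = [(0.0, 1.0), (2.0, -2.0)]
labels = [1, -1]
refined = project_estimates(points, labels, w=(0.0, 1.0))
```

Here the raw nearest-point distance is $\sqrt{13} \approx 3.61$, while the projected value is 3, the separation of the two points measured normal to the boundary. The projection therefore removes the tangential part of $y_i - x_i$, which carries no information about the distance to a hyperplane.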

5 Test Results



5.1 Linearly separable problems 

We applied both non-iterative and iterative forms of the linear regression signed distance classifier
to three types of distributions: uniform, normal and skewed. In all of these tests the linear SDF
classifier decisively outperforms the linear classifiers in the LIBSVM package as well as the linear
Lagrangian SVM [13] and the Proximal SVM [9].

A linearly separable problem can be transformed by a linear change of coordinates to the problem
where $\partial A = \{x_n = 0\}$. Thus we use this problem for our tests. We performed the tests for $n = 2$.
Data in the half space $x_n > 0$ we labeled as in $A$ and data with $x_n < 0$ we labeled as in $A^c$.

In the uniform distribution tests we let the domain be the square $[-1, 1] \times [-1, 1] \subset \mathbb{R}^2$. For the
normal distribution we used the standard normal distribution at the origin in $\mathbb{R}^2$. For the skewed
distribution, we randomly chose points in the square $[0, 1]^2 \subset \mathbb{R}^2$ using a skewed density $p(x_i)$,
then scaled them affinely to $[-1, 1]^2$.

In the tests, we considered training sets of size from $m = 10$ to $m = 10,000$. For each $m$ in the
range we classified 50 distributions of m points. In each test we used a test set consisting of 4000 
points selected randomly according to the distribution type being tested. 

In Figure 1 we show comparisons of the SDF linear classifiers with the linear classifier from the
LIBSVM package along with the Lagrangian SVM [13] and Proximal SVM [9]. In this plot csvm and usvm
are routines from the LIBSVM package, psvm is the Proximal SVM and lsvm is the Lagrangian
SVM algorithm. In these tests the iterated SDF method was iterated 5 times. The iterated SDF 
method gave a 10% to 15% decrease in the error over the non-iterated SDF method. The Lagrangian 
method was also iterated 5 times. The LIBSVM package methods automatically iterate. In our 
trials, the number of iterations for the LIBSVM methods increased with m and varied from 10 to 
5000 iterations. 

It can be seen that the SDF classifier has noticeably smaller errors than either SVM method
over a large range of number of training points $m$. Averaged over all 550 tests, the SDF-based
classification produced 52% fewer misclassifications than the best SVM (LIBSVM c-SVM) method
(average ≈ 98% correct vs. 96% correct).

5.2 The 4x4 Checkerboard Problem 

There are several benchmark nonlinear problems, but perhaps the prototype is the 4 by 4 checkerboard.
This geometric problem is interesting because it is known to be difficult. In this test a
square is partitioned into 16 equal subsquares with alternate squares belonging to two distinct types,
black or white (Figure 2). Following [?], we used 1,000 randomly selected points in each training
set and 40,000 grid points as the test set.
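The checkerboard data are easy to generate. The sketch below labels points of the unit square on a 4 by 4 checkerboard, draws 1,000 uniformly random training points and builds a 200 by 200 test grid of 40,000 points. The choice of the unit square, the assignment of +1 to the lower-left cell, the random seed and the use of cell-centered grid points are assumptions made here for illustration.

```python
import random

def checkerboard_label(x, y, cells=4):
    """+1/-1 label of a point of the unit square on a cells-by-cells checkerboard."""
    col = min(int(x * cells), cells - 1)    # clamp so that x = 1.0 stays in the last cell
    row = min(int(y * cells), cells - 1)
    return 1 if (row + col) % 2 == 0 else -1

random.seed(0)
train = [(random.random(), random.random()) for _ in range(1000)]
train_labels = [checkerboard_label(x, y) for x, y in train]

# 200 x 200 grid of cell-centered test points, none of which lies on a cell edge.
grid = [((i + 0.5) / 200, (j + 0.5) / 200) for i in range(200) for j in range(200)]
grid_labels = [checkerboard_label(x, y) for x, y in grid]
```

Accuracy on this problem is then the fraction of grid points whose predicted sign agrees with `grid_labels`.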

In Figure 3 we show the results of applying an SDF-based classification scheme and the standard
LIBSVM package to 100 independent training sets. We used the nonlinear least squares regression
with parameters $\sigma$ and $\gamma$ found by cross-validation on the training sets. Parameters for the SVM
classification were chosen by precisely the same process.

Note that the SDF-based method produces better results than the LIBSVM package. The mean 
correct % and standard deviation for the SDF method were 96.3% and 0.46%. For the SVM method
the mean correct % was 94.5% with a standard deviation of 0.36%.







Figure 1: A comparison of classifiers on linearly separable data. In (a) the data are uniformly
distributed, in (b) the data are normally distributed, and in (c) the data are from a skewed distribution.
The x-axis is log(m), where m is the number of data (training) points. Each point in the
graph represents an average over 50 independent tests.

Figure 2: The 4 by 4 checkerboard problem classified by a nonlinear SDF-based classifier.



The standard deviation for our SDF method is slightly higher than that for the LIBSVM
package. This is perhaps an artifact of our naive implementation of an SDF-based method.

We note that [13] reported 97.1% accuracy on this problem using a Lagrangian SVM with
100,000 iterations.

Figure 3: Comparison of performance by the LIBSVM package and an SDF-based method on the
four by four checkerboard problem.

5.3 Micro-array data sets 

We compare the nonlinear SDF classifier with existing studies of SVM performance on two standard 
micro-array data sets involving cancer diagnosis. The first set, called Prostate-tumor, consists of 
102 micro-array samples with 10,510 measurements each. Each patient represented by a sample was 
diagnosed independently for the presence of a prostate tumor. Of the 102 samples, 52 were from 
patients with a prostate tumor [20]. The second data set, DLBCL, consists of 77 samples with 5470
variables each. Samples were taken from patients diagnosed with a lymphoma. Of the 77 patients,
58 had diffuse large B-cell lymphomas and 19 had follicular lymphomas [19].

With the two data sets, we tested the SDF nonlinear classifier with least squares regression
using Leave One Out Cross Validation (LOOCV). Accuracy of the LOOCV tests
is shown in Table 1, with comparison of reported LOOCV performance from [22] using a variety
of SVM methods and the k-nearest neighbor (KNN) method. The percentages reported are the
percentages of correctly identified samples. It is important to point out that [22] showed that their
error estimates did not change depending on the cross validation technique, and hence LOOCV is
a robust estimate of performance when applied to these data.
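The LOOCV protocol itself is independent of the classifier. A generic sketch follows, in which `train_and_predict` is a hypothetical callable that trains on the retained samples and returns a predicted label for the held-out one; the 1-nearest-neighbor rule is a stand-in used only to exercise the loop and is not one of the classifiers compared in Table 1.

```python
import math

def loocv_accuracy(points, labels, train_and_predict):
    """Fraction of samples predicted correctly when each one is held out in turn."""
    correct = 0
    for i in range(len(points)):
        train_x = points[:i] + points[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if train_and_predict(train_x, train_y, points[i]) == labels[i]:
            correct += 1
    return correct / len(points)

def nearest_neighbor(train_x, train_y, x):
    """Stand-in classifier: label of the closest retained training point."""
    j = min(range(len(train_x)), key=lambda k: math.dist(train_x[k], x))
    return train_y[j]

points = [(0.0,), (0.1,), (0.2,), (1.0,), (1.1,), (1.2,)]
labels = [1, 1, 1, -1, -1, -1]
accuracy = loocv_accuracy(points, labels, nearest_neighbor)
```

Any quantity estimated from the data, such as kernel widths or feature weights, has to be recomputed inside the loop from `train_x` and `train_y` only, so that the held-out sample never influences its own prediction.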



Data set          KNN    SVM      SDF
Prostate-tumor    85%    92.2%    94.1%
DLBCL             87%    97.4%    97.4%



Table 1: Results from applying SDF-based classification to the benchmark DLBCL and Prostate- 
tumor micro-array data sets compared with the performance of SVM methods and the k-nearest 
neighbor method reported in [22]. All results are for leave-one-out cross validation.

As seen in Table 1, the performance of the SDF classifier matches that of the SVM method on
the DLBCL data and exceeds it on the Prostate-tumor data set. 

In preliminary tests we found that performance on several LOOCV subsets once again was
unaffected by $\gamma$ over a broad range of powers of 10. Based on this we simply set $\gamma = 10$ for the rest
of the tests. Values for $\sigma$ were determined within each loop of the LOOCV process based on mean
interdata distances for the subset. For the DLBCL data values of $\sigma$ were approximately $2.5 \times 10^{?}$
and $\sigma$ was generally about $3.4 \times 10^{?}$ for subsets of the Prostate-tumor set. For both of these data
sets the optimal results are actually robust with respect to changes in the values of $\sigma$.

In these tests we used a weighted distance

$$ d_a(x, y) = |a \cdot (x - y)|, $$

where $|\cdot|$ is the usual Euclidean norm and $a$ is a vector of weights. Distance functions of this type
were shown to be effective in high dimensional binary classification problems in [?]. Specifically, we
took $a$ to be the absolute values of the correlation coefficients relating each variable to the indicator
on the data set. This was recalculated in the LOOCV process for each subset, independent of the
excluded sample.
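A sketch of such a weighted distance is given below. It assumes that $a \cdot (x - y)$ is meant componentwise, so that $d_a$ is the Euclidean norm of the vector with entries $a_k (x_k - y_k)$, that the correlation is the Pearson coefficient between each variable and the +1/-1 labels, and that a constant variable receives weight zero. These are interpretations made for the illustration, and the names are hypothetical.

```python
import math

def correlation_weights(points, labels):
    """a_k = |Pearson correlation between variable k and the +1/-1 labels|."""
    m = len(points)
    mean_l = sum(labels) / m
    var_l = sum((l - mean_l) ** 2 for l in labels)
    weights = []
    for k in range(len(points[0])):
        col = [p[k] for p in points]
        mean_c = sum(col) / m
        var_c = sum((v - mean_c) ** 2 for v in col)
        cov = sum((v - mean_c) * (l - mean_l) for v, l in zip(col, labels))
        # A constant variable (or a single-class label vector) gets zero weight.
        ok = var_c > 0 and var_l > 0
        weights.append(abs(cov) / math.sqrt(var_c * var_l) if ok else 0.0)
    return weights

def weighted_distance(a, x, y):
    """Euclidean norm of the componentwise product a * (x - y)."""
    return math.sqrt(sum((ak * (xk - yk)) ** 2 for ak, xk, yk in zip(a, x, y)))

# The first variable tracks the class; the second is constant and uninformative.
points = [(1.0, 5.0), (2.0, 5.0), (-1.0, 5.0), (-2.0, 5.0)]
labels = [1, 1, -1, -1]
a = correlation_weights(points, labels)
d = weighted_distance(a, (1.0, 5.0), (-1.0, 7.0))
```

Variables that are uncorrelated with the class contribute little or nothing to $d_a$, which is the point of the weighting when thousands of measurements are present. With zero weights $d_a$ is only a pseudo-metric, since distinct points can be at distance zero.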

We note that in an independent set of experiments reported in [2], the nonlinear SDF-based 
classifier was compared to KNN, RBFN and SVM classifiers on five other cancer data sets. The 
following microarray data sets are involved: The Breast Cancer data set [?] consists of 49 tumor
samples with 7129 human genes each. There are two different response variables in the data set:
one describes the status of the estrogen receptor (ER), and the other one describes the status of
the lymph nodes (LN), which is an indicator of the metastatic spread of the tumor. Of the 49
samples, 25 are ER+ and 24 are ER-, 25 are LN+ and 24 are LN-. The Colon Cancer data set
consists of 40 tumor and 22 normal colon tissues with 2000 genes each. The Leukaemia data set [10]
consists of 72 samples with 7129 genes each. Each patient represented by a sample has either acute 
lymphoblastic leukemia (ALL) or acute myeloid leukemia (AML). Of the 72 samples, 47 are ALL 
and 25 are AML. 

We tested the four classifiers in 100 independent trials on each of the data sets. In each trial, 
the data were divided randomly into a training set and a test set in a ratio of 2:1. We used Gaussian 
kernel functions for RBFN, SVM and SDF classifiers. For simplicity, we did not use any heuristic 
for the distance metrics of KNN, using the Euclidean distance. We claim that the classifiers are 
comparable in this setting since they are under exactly the same conditions: (i) they share the same
training set and test set in each trial, (ii) SVM and SDF share the same $\gamma = 10^{?}$, (iii) SVM uses
the weighted kernel matrix returned by SDF in each trial, (iv) SVM and RBFN use the same $\sigma$,
which is computed in each trial as the root mean square distance (RMSD) of the training data.



Data Set             KNN      RBFN     SVM      SDF
Breast cancer, ER    .0912    .0912    .0869    .0869
Breast cancer, LN    .2400    .2425    .2106    .2100
Colon cancer         .2200    .2143    .1700    .1662
Leukaemia            .0146    .0321    .0167    .0167



Table 2: Comparison of misclassification ratios averaged over 100 trials on randomly divided data. 

Table 2 shows the test error rates averaged over the 100 independent trials for each classifier.
KNN with k = 1,9,3,5 neighbors achieves the best (in the averaging sense) generalization per- 
formance for the breast cancer data (ER), breast cancer data (LN), colon cancer data, and the 
leukemia data, respectively. Note that in actual use, k would have to be determined in some unbi- 
ased way from the training data only. We note again that the naive SDF method matches or beats 
the SVM method on all data. 

6 Discussion 

In order to make the SDF paradigm competitive with indicator function based classification, the 
main need seems to be for more accurate, yet efficient, ways of obtaining an approximation of 
$b_A(x_i)$. In the scheme we used in these tests, we simply search the entire data set for the closest
point of the opposite type. In the worst case this takes $O(m^2)$ operations, which is easily within the
realm of practical computations. Increasing the accuracy of the approximation is a more difficult 
issue and should involve deeper geometric information from the data set. 

In addition to better determination of {bi}, use of other methods of nonlinear regression, in- 
cluding SVM and RBFN regression, with SDF-based classification should be explored. Another 
area for future exploration is development of iterative methods for the nonlinear classifier. We have 
described two possible procedures for this iteration and implemented one in a linear setting. 

Smale and coworkers have been developing methods for rigorous estimates for the least squares
regression algorithm outlined in §4.4. They produce these estimates in the framework of Reproducing
Kernel Hilbert Spaces, which have been shown to be isomorphic to certain Sobolev spaces [21],
including the space $H^1_{\mathrm{loc}}$, to which signed distance functions are known to belong [?]. The estimates
could be used in our context if the accuracy of the initial estimates $\{b_i\}_{i=1}^m$ is known. For the
naive method of determining these values, an upper bound is given by (3) and we hope to obtain
better bounds under assumptions on $A$. Other geometric methods for approximating $\{b_i\}_{i=1}^m$ should
lend themselves to rigorous analysis depending on the methods. Perhaps for the case of iterative
methods, the iteration process could be linked to known results in PDE and related functional 
analysis. Such a link would make an extremely rich arena of knowledge available for the purposes 
of estimates. 

The above estimates of the approximated SDF should not only result in an overall reliability
measure of the method, but should provide for any given test point an estimate of the distance
of that test point from the decision surface. Combining this with statistical knowledge of the
underlying application could provide a very natural "level of confidence" measure for any given 
test data. Such estimates would be especially useful in the context of biomedical applications. 

There are concrete mathematical reasons why the SDF is a better basis than the indicator 
function for use in classification. The SDF is fundamentally geometric and this connects it solidly to 
geometric and analytical tools and methods. In preliminary tests, we have shown that a naive, non- 
optimized implementation of SDF-based classification is non-trivially more accurate than standard 
methods on geometric problems. In preliminary tests on nonlinear, high dimensional and noisy 
data, we have demonstrated that a non-optimized implementation of SDF is at least as accurate as 
current, standard SVM methods. These observations and results indicate that the SDF paradigm 
has the potential to be the basis for more accurate binary classification algorithms in many contexts. 